Inference and quantile regression for the unit-exponentiated Lomax distribution

In probability theory and statistics, it is customary to employ unit distributions to explain practical variables having values between zero and one. This study suggests a brand-new distribution for modelling data on the unit interval called the unit-exponentiated Lomax (UEL) distribution. The statistical aspects of the UEL distribution are shown. The parameters corresponding to the proposed distribution are estimated using widely recognized estimation techniques, such as Bayesian, maximum product of spacing, and maximum likelihood. The effectiveness of the various estimators is assessed through a simulated scenario. Using mock jurors and food spending data sets, the UEL regression model is demonstrated as an alternative to unit-Weibull regression, beta regression, and the original linear regression models. Using Covid-19 data, the novel model outperforms certain other unit distributions according to different comparison criteria.


Introduction
Lomax [1] established the Lomax (L) or Pareto II distribution to model data on business failure. This distribution has found widespread use in a variety of domains, including income and wealth disparity [2,3], city size [4], the size distribution of computer files on servers [5], internet traffic [6], and receiver operating characteristic curve analysis [7]. Reference [8] claimed that the Lomax distribution is a good fit for modeling reliability issues because many of its characteristics can be understood in that context, and that it could serve as an alternative to the well-known distributions used in reliability. Reference [9] utilized this distribution to model size spectrum data in aquatic ecology. Several characterizations of the Lomax distribution are available. It is a particular case of the Pearson Type VI distribution. It has also been regarded as a mixture of the exponential and gamma distributions. The L model emerges as a limiting distribution of residual lifetimes at very old ages [10]. The L distribution belongs to the family of decreasing failure rate distributions in the context of lifetimes [11]. Reference [12] mentioned that the L distribution is a heavy-tailed alternative to the exponential, Weibull, and gamma distributions. Moreover, it belongs to the Burr family of distributions [13]. The cumulative distribution function (CDF) of the Lomax distribution is defined by:
$$F(x; \lambda, \delta) = 1 - (1 + \lambda x)^{-\delta}, \quad x > 0, \tag{1}$$
where λ, δ ∈ ℝ⁺ are the scale and shape parameters, respectively. The probability density function (PDF) of the Lomax distribution is as follows:
$$f(x; \lambda, \delta) = \lambda \delta (1 + \lambda x)^{-(\delta + 1)}, \quad x > 0. \tag{2}$$
From PDF (2), it is clear that the L distribution simplifies to:
1. The beta prime (or inverted beta) distribution for λ = 1, δ ≠ 1.
To obtain a better fit in data analysis across different disciplines, modified and extended variants of the L distribution have been established. Some of these generalizations are the exponentiated L [14], Marshall-Olkin L distribution [15], McDonald L distribution [16], Weibull L distribution [17], gamma L distribution [18], exponentiated Weibull L distribution [19], Maxwell L distribution [20], and Nadarajah-Haghighi L distribution [21].
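The present study focuses on the exponentiated L (EL) distribution, which has an extra shape parameter ϑ compared to (1). The EL distribution has the following CDF and PDF:
$$G(x; \Theta) = \left[1 - (1 + \lambda x)^{-\delta}\right]^{\vartheta}, \quad x > 0, \tag{3}$$
and
$$g(x; \Theta) = \lambda \delta \vartheta (1 + \lambda x)^{-(\delta + 1)} \left[1 - (1 + \lambda x)^{-\delta}\right]^{\vartheta - 1}, \quad x > 0, \tag{4}$$
where Θ ≡ (δ, λ, ϑ) ∈ ℝ⁺. The EL model reduces to well-known distributions for specific parameter values: it gives the Lomax distribution for ϑ = 1 and the exponentiated Pareto distribution for λ = 1.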
Recently, the development of new flexible probability distributions that provide well-fitting models for datasets with values ranging from 0 to 1 has piqued statisticians' curiosity. These bounded distributions are necessary for modeling proportions, percentages, and probabilities. In applied disciplines, there is a strong need for the analysis of datasets on (0, 1) using parametric, semi-parametric, and regression models. Furthermore, unit distributions allow extra flexibility throughout the unit interval without introducing new parameters to the fundamental distribution. The following are some of the most important unit distributions with diverse numbers of parameters: log-Lindley distribution [22], unit-Birnbaum-Saunders distribution [23], unit-Gompertz distribution [24], unit-inverse Gaussian distribution [25], unit-Weibull distribution [26], unit-Burr-XII distribution [27], unit-Gamma/Gompertz distribution [28], unit exponentiated half logistic distribution [29], unit power Burr X distribution [30], and unit inverse exponentiated Weibull distribution [31].
In this study, in light of the above, an inverse-exponential transformation is used to generate a new flexible three-parameter unit distribution based on the EL distribution. The new distribution, named the unit-exponentiated Lomax (UEL) distribution, may be used to analyze a wide range of datasets with values between zero and one. A quantile regression model is developed based on the parametrization of the UEL distribution in terms of its q th quantile. The UEL distribution is introduced with the following aims:
1. To provide a novel distribution defined on (0, 1) as a competitor to the existing bounded distributions;
2. To provide a distribution whose hazard rate can take different shapes, such as constant, increasing, decreasing, and bathtub;
3. To provide a model appropriate for fitting skewed data that may not be well fitted by other common distributions, and that can be used to address a wide range of issues in many fields;
4. To investigate key statistical features of the UEL distribution, such as entropy measures, probability-weighted moments (PWMs), moments, stress-strength (S-S) reliability, and incomplete moments (IM);
5. To investigate inferential features of the UEL distribution parameters using widespread estimation methodologies, such as maximum likelihood (ML), maximum product of spacing (MPS), and Bayesian estimation;
6. To examine the performance of the parameter estimates using a simulation study;
7. To examine three real-data applications: the first and second data sets are concerned with quantile regression modeling, while the third data set is concerned with data modeling.
The paper is structured as follows. Section 2 defines the suggested distribution. Section 3 describes its essential distributional features. The methodologies for estimating the unknown parameters using various estimation approaches are covered in Section 4. In Section 5, a simulation analysis is conducted to assess the parameter estimates. The new quantile regression model based on the newly specified distribution is described in Section 6. Two real-world applications employing the suggested quantile regression model alongside other well-known regression models are shown in Section 7. The application of the UEL distribution to Covid-19 data in Section 7 reveals that it is preferred to the other seven unit distributions. Section 8 concludes the paper.

Model description
In this section, the EL distribution is restricted to the unit interval to introduce a new bounded distribution with support on (0, 1), referred to as the UEL distribution. Several extensions and modifications have been obtained through such transformations; in the present work, an exponential transformation similar to the one used in References [24–31] is applied.
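Suppose that $Y = e^{-X}$, where X follows the EL distribution with CDF $G(x; \Theta) = [1 - (1 + \lambda x)^{-\delta}]^{\vartheta}$; then the CDF of the UEL distribution is determined as follows:
$$F(y; \Theta) = P(Y \le y) = P(X \ge -\ln y) = 1 - G(-\ln y; \Theta), \quad 0 < y < 1.$$
Based on the previous equation, the CDF and PDF of the UEL distribution are provided in the following definition.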
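Definition: A random variable Y is said to follow the UEL distribution with parameter vector Θ ≡ (δ, λ, ϑ) ∈ ℝ⁺ if its CDF and PDF are given by:
$$F(y; \Theta) = 1 - \left[1 - (1 - \lambda \ln(y))^{-\delta}\right]^{\vartheta}, \quad 0 < y < 1, \tag{5}$$
and
$$f(y; \Theta) = \frac{\lambda \delta \vartheta}{y} (1 - \lambda \ln(y))^{-\delta - 1} \left[1 - (1 - \lambda \ln(y))^{-\delta}\right]^{\vartheta - 1}, \quad 0 < y < 1. \tag{6}$$
Note that F(y; Θ) = 0 for y ≤ 0, and F(y; Θ) = 1 for y ≥ 1. The survival function and hazard rate function (HRF) are as follows:
$$S(y; \Theta) = \left[1 - (1 - \lambda \ln(y))^{-\delta}\right]^{\vartheta}$$
and
$$h(y; \Theta) = \frac{\lambda \delta \vartheta \, (1 - \lambda \ln(y))^{-\delta - 1}}{y \left[1 - (1 - \lambda \ln(y))^{-\delta}\right]}.$$
The density and HRF plots of the UEL distribution are given in Fig 1 for specific values of parameters.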
From Fig 1, it can be concluded that the UEL density takes a variety of forms, such as left-skewed, U-shaped, unimodal, and J-shaped. Also, the HRF can be increasing, decreasing, J-shaped, or bathtub-shaped.
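The following is a minimal sketch of how the CDF, PDF, and HRF of the UEL distribution could be evaluated numerically. It assumes the closed forms that follow from $Y = e^{-X}$ with X exponentiated Lomax, namely $F(y) = 1 - [1 - (1 - \lambda \ln y)^{-\delta}]^{\vartheta}$ on (0, 1). The function names `uel_cdf`, `uel_pdf`, and `uel_hrf` and the argument order (delta, lam, theta) are illustrative choices, not part of the paper.

```python
import math


def uel_cdf(y, delta, lam, theta):
    """UEL CDF: F(y) = 1 - [1 - (1 - lam*ln y)^(-delta)]^theta on (0, 1)."""
    if y <= 0.0:
        return 0.0
    if y >= 1.0:
        return 1.0
    a = 1.0 - (1.0 - lam * math.log(y)) ** (-delta)
    return 1.0 - a ** theta


def uel_pdf(y, delta, lam, theta):
    """UEL PDF, i.e. the derivative of uel_cdf with respect to y."""
    if not 0.0 < y < 1.0:
        return 0.0
    t = 1.0 - lam * math.log(y)   # t > 1 on (0, 1)
    a = 1.0 - t ** (-delta)       # a lies in (0, 1)
    return lam * delta * theta / y * t ** (-delta - 1.0) * a ** (theta - 1.0)


def uel_hrf(y, delta, lam, theta):
    """Hazard rate h(y) = f(y) / S(y), with S(y) = 1 - F(y); intended for 0 < y < 1."""
    survival = 1.0 - uel_cdf(y, delta, lam, theta)
    return uel_pdf(y, delta, lam, theta) / survival


if __name__ == "__main__":
    # Illustrative parameter values only (not taken from the paper's figures).
    for y in (0.1, 0.5, 0.9):
        print(y, uel_cdf(y, 2.0, 1.0, 1.5), uel_pdf(y, 2.0, 1.0, 1.5),
              uel_hrf(y, 2.0, 1.0, 1.5))
```

Evaluating these functions on a grid of y values for several parameter triples is one way to reproduce shape plots of the kind described for Fig 1.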

Characteristics of the UEL distribution
Some structural features of the UEL distribution, such as the PWM, moments and IM, quantile function, some entropy measures, and S-S reliability, are investigated in this section.

Probability weighted moments
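The PWM is often preferred to ordinary moments because it is less sensitive to extreme values, and it is occasionally used when ML estimators are difficult to obtain. The class of PWMs, denoted by $C_{h,s}$, for a random variable Y is defined as follows:
$$C_{h,s} = E\left[Y^{h} F(Y)^{s}\right] = \int_{-\infty}^{\infty} y^{h} f(y) \left[F(y)\right]^{s} dy. \tag{7}$$
Substituting (5) and (6) in (7), the PWM of the UEL distribution is
$$C_{h,s} = \lambda \delta \vartheta \int_{0}^{1} y^{h-1} (1 - \lambda \ln(y))^{-\delta - 1} \left[1 - (1 - \lambda \ln(y))^{-\delta}\right]^{\vartheta - 1} \left\{1 - \left[1 - (1 - \lambda \ln(y))^{-\delta}\right]^{\vartheta}\right\}^{s} dy.$$
Using the binomial expansion, where s is a positive integer, $C_{h,s}$ is as follows:
$$C_{h,s} = \lambda \delta \vartheta \sum_{j=0}^{s} (-1)^{j} \binom{s}{j} \int_{0}^{1} y^{h-1} (1 - \lambda \ln(y))^{-\delta - 1} \left[1 - (1 - \lambda \ln(y))^{-\delta}\right]^{\vartheta (j+1) - 1} dy. \tag{8}$$
Using the following binomial expansion in the last term of (8),
$$\left[1 - (1 - \lambda \ln(y))^{-\delta}\right]^{\vartheta (j+1) - 1} = \sum_{m=0}^{\infty} (-1)^{m} \binom{\vartheta (j+1) - 1}{m} (1 - \lambda \ln(y))^{-\delta m}, \tag{9}$$
and inserting (9) in (8) leads to
$$C_{h,s} = \lambda \delta \vartheta \sum_{j=0}^{s} \sum_{m=0}^{\infty} (-1)^{j+m} \binom{s}{j} \binom{\vartheta (j+1) - 1}{m} \int_{0}^{1} y^{h-1} (1 - \lambda \ln(y))^{-\delta (m+1) - 1} dy.$$
Let z = −λ ln(y); then the UEL distribution's PWM is:
$$C_{h,s} = \delta \vartheta \sum_{j=0}^{s} \sum_{m=0}^{\infty} (-1)^{j+m} \binom{s}{j} \binom{\vartheta (j+1) - 1}{m} \int_{0}^{\infty} e^{-h z / \lambda} (1 + z)^{-\delta (m+1) - 1} dz.$$
Using the exponential expansion in the previous equation, the PWM of the UEL distribution becomes
$$C_{h,s} = \delta \vartheta \sum_{j=0}^{s} \sum_{m=0}^{\infty} \sum_{k=0}^{\infty} \frac{(-1)^{j+m+k}}{k!} \left(\frac{h}{\lambda}\right)^{k} \binom{s}{j} \binom{\vartheta (j+1) - 1}{m} B\left(k + 1, \delta (m+1) - k\right), \tag{10}$$
where B(.,.) stands for the beta function (BF).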

Moments & associated measures
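If Y has the PDF (6), then the s th moment is derived as
$$\mu'_{s} = E(Y^{s}) = \lambda \delta \vartheta \int_{0}^{1} y^{s-1} (1 - \lambda \ln(y))^{-\delta - 1} \left[1 - (1 - \lambda \ln(y))^{-\delta}\right]^{\vartheta - 1} dy.$$
Let z = −λ ln(y) and use the binomial and exponential expansions; then the previous equation is
$$\mu'_{s} = \delta \vartheta \sum_{m=0}^{\infty} \sum_{k=0}^{\infty} \frac{(-1)^{m+k}}{k!} \left(\frac{s}{\lambda}\right)^{k} \binom{\vartheta - 1}{m} B\left(k + 1, \delta (m+1) - k\right).$$
The s th central moment, say $\mu_{s}$, of a given random variable Y is defined by:
$$\mu_{s} = E\left[(Y - \mu'_{1})^{s}\right] = \sum_{i=0}^{s} (-1)^{i} \binom{s}{i} (\mu'_{1})^{i} \mu'_{s-i}.$$
Numerical values of the first four moments, variance (σ²), coefficient of skewness (CS), and coefficient of kurtosis (CK) of the UEL distribution are listed in Table 1.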
From Table 1, it can be concluded that the moment values decrease with increasing values of ϑ for fixed values of λ and δ. Also, according to the values of CS and CK, the distribution is right-skewed and can be either platykurtic or leptokurtic.
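As a numerical cross-check on the moment measures, the sketch below computes raw moments, variance, CS, and CK by quadrature instead of series expansions. It relies on the identity $E(Y^{s}) = \int_{0}^{1} Q(u)^{s} du$, where Q is the quantile function obtained by inverting the CDF $F(y) = 1 - [1 - (1 - \lambda \ln y)^{-\delta}]^{\vartheta}$. The midpoint rule, the node count, and all function names are assumptions made for illustration; this is not the procedure used to produce Table 1.

```python
import math


def uel_quantile(q, delta, lam, theta):
    """Quantile function obtained by inverting the UEL CDF."""
    base = 1.0 - (1.0 - q) ** (1.0 / theta)
    if base <= 0.0:               # q numerically indistinguishable from 0
        return 0.0
    return math.exp((1.0 - base ** (-1.0 / delta)) / lam)


def uel_raw_moment(s, delta, lam, theta, n=20000):
    """Approximate E[Y^s] = integral of Q(u)^s over (0, 1) by the midpoint rule."""
    total = 0.0
    for i in range(n):
        u = (i + 0.5) / n
        total += uel_quantile(u, delta, lam, theta) ** s
    return total / n


def uel_summary(delta, lam, theta):
    """Mean, variance, coefficient of skewness (CS) and of kurtosis (CK)."""
    m1, m2, m3, m4 = (uel_raw_moment(s, delta, lam, theta) for s in (1, 2, 3, 4))
    var = m2 - m1 ** 2
    mu3 = m3 - 3.0 * m1 * m2 + 2.0 * m1 ** 3                      # third central moment
    mu4 = m4 - 4.0 * m1 * m3 + 6.0 * m1 ** 2 * m2 - 3.0 * m1 ** 4  # fourth central moment
    return {"mean": m1, "variance": var, "CS": mu3 / var ** 1.5, "CK": mu4 / var ** 2}


if __name__ == "__main__":
    # Illustrative parameter values only.
    for theta in (1.5, 2.0, 3.0):
        print(theta, uel_summary(2.0, 1.0, theta))
```

Because the survival function is $[1 - (1 - \lambda \ln y)^{-\delta}]^{\vartheta}$ with a base below one, the mean necessarily decreases as ϑ grows for fixed λ and δ, which agrees with the pattern reported for Table 1.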

Quantile function
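For q ∈ (0, 1), the quantile function (QF) of Y is obtained by inverting (5) as follows:
$$q = 1 - \left[1 - (1 - \lambda \ln(y_{q}))^{-\delta}\right]^{\vartheta},$$
which provides
$$y_{q} = \exp\left\{\frac{1}{\lambda} \left[1 - \left(1 - (1 - q)^{1/\vartheta}\right)^{-1/\delta}\right]\right\}. \tag{11}$$
The first quartile, median, and third quartile are obtained, respectively, by setting q = 0.25, 0.5, and 0.75 in (11). Simulating the random variable is also simple: if q is a uniform variate on (0, 1), then $Y = y_{q}$ follows (5).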
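The inverse-transform idea described above can be sketched as follows. The quantile expression used in the code is the inversion of $F(y) = 1 - [1 - (1 - \lambda \ln y)^{-\delta}]^{\vartheta}$; the function names, the seed handling, and the parameter values are illustrative assumptions.

```python
import math
import random


def uel_cdf(y, delta, lam, theta):
    """UEL CDF on (0, 1)."""
    if y <= 0.0:
        return 0.0
    if y >= 1.0:
        return 1.0
    return 1.0 - (1.0 - (1.0 - lam * math.log(y)) ** (-delta)) ** theta


def uel_quantile(q, delta, lam, theta):
    """Solve F(y) = q for y."""
    base = 1.0 - (1.0 - q) ** (1.0 / theta)
    if base <= 0.0:               # q numerically indistinguishable from 0
        return 0.0
    return math.exp((1.0 - base ** (-1.0 / delta)) / lam)


def uel_sample(n, delta, lam, theta, seed=0):
    """Draw n UEL variates by applying the quantile function to uniform draws."""
    rng = random.Random(seed)
    return [uel_quantile(rng.random(), delta, lam, theta) for _ in range(n)]


if __name__ == "__main__":
    # Quartiles for illustrative parameter values, then a small sample.
    print([uel_quantile(q, 2.0, 1.0, 1.5) for q in (0.25, 0.5, 0.75)])
    print(uel_sample(5, 2.0, 1.0, 1.5, seed=1))
```

A quick sanity check on any implementation of this kind is the round trip `uel_cdf(uel_quantile(q, ...), ...)`, which should return q up to rounding error.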

Incomplete moments
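The UEL distribution's s th lower IM is given by:
$$\eta_{s}(t) = \int_{0}^{t} y^{s} f(y; \Theta) \, dy, \quad 0 < t < 1.$$
Using the binomial expansion and letting z = −λ ln(y), $\eta_{s}(t)$ can be written as:
$$\eta_{s}(t) = \delta \vartheta \sum_{m=0}^{\infty} (-1)^{m} \binom{\vartheta - 1}{m} \int_{-\lambda \ln(t)}^{\infty} e^{-s z / \lambda} (1 + z)^{-\delta (m+1) - 1} dz.$$
Applying the exponential expansion then expresses $\eta_{s}(t)$ in terms of B(.,.,t), the incomplete BF. The Lorenz curve, defined by $L(t) = \eta_{1}(t) / \mu'_{1}$, is a notable application of the first IM. Such curves are particularly important in economics, demography, insurance, and related fields.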

Information measures
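Here, certain uncertainty measures are investigated, including the Rényi (Ré) entropy, the Havrda and Charvat (H-C) entropy, and the ω-entropy. The Ré entropy of a random variable measures the variation of the uncertainty. The Ré entropy of a random variable Y having the UEL distribution is defined by:
$$E_{R}(\omega) = \frac{1}{1 - \omega} \ln \int_{0}^{1} f(y; \Theta)^{\omega} dy, \quad \omega > 0, \ \omega \neq 1. \tag{12}$$
Substituting (6) in (12), then
$$E_{R}(\omega) = \frac{1}{1 - \omega} \ln \left\{ (\lambda \delta \vartheta)^{\omega} \int_{0}^{1} y^{-\omega} (1 - \lambda \ln(y))^{-\omega (\delta + 1)} \left[1 - (1 - \lambda \ln(y))^{-\delta}\right]^{\omega (\vartheta - 1)} dy \right\}.$$
Using the binomial expansion, $E_{R}(\omega)$ is converted to
$$E_{R}(\omega) = \frac{1}{1 - \omega} \ln \left\{ (\lambda \delta \vartheta)^{\omega} \sum_{m=0}^{\infty} (-1)^{m} \binom{\omega (\vartheta - 1)}{m} \int_{0}^{1} y^{-\omega} (1 - \lambda \ln(y))^{-\omega (\delta + 1) - \delta m} dy \right\}.$$
Letting z = −λ ln(y) and expanding the exponential term, the Ré entropy of the UEL distribution is obtained as a series in beta functions. The H-C entropy and the ω-entropy measures of the UEL distribution, denoted by $H_{R}(\omega)$ and $\xi_{R}(\omega)$, are obtained from the same integral of $f(y; \Theta)^{\omega}$. Table 2 gives some numerical values of $E_{R}(\omega)$, $H_{R}(\omega)$, and $\xi_{R}(\omega)$ for the same selected parameter values provided in Table 1.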

Stress-strength model
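In the statistical literature, the S-S reliability is typically represented as $R = P(Y_{2} < Y_{1})$. The expression comes from a basic scenario in which a system with random strength $Y_{1}$ is exposed to random stress $Y_{2}$, with the system failing if the stress exceeds the strength. Let $Y_{1}$ and $Y_{2}$ be independent random variables with UEL(λ, δ, ϑ₁) and UEL(λ, δ, ϑ₂) distributions. The S-S reliability of the UEL distribution is then determined as follows:
$$R = \int_{0}^{1} f(y; \lambda, \delta, \vartheta_{1}) F(y; \lambda, \delta, \vartheta_{2}) \, dy = 1 - \lambda \delta \vartheta_{1} \int_{0}^{1} \frac{1}{y} (1 - \lambda \ln(y))^{-\delta - 1} \left[1 - (1 - \lambda \ln(y))^{-\delta}\right]^{\vartheta_{1} + \vartheta_{2} - 1} dy.$$
Then the S-S reliability is obtained as follows:
$$R = \frac{\vartheta_{2}}{\vartheta_{1} + \vartheta_{2}}.$$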
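A Monte Carlo check of the S-S reliability can be sketched as below. It assumes the convention $R = P(Y_{2} < Y_{1})$ with strength $Y_{1}$ ~ UEL(λ, δ, ϑ₁) and stress $Y_{2}$ ~ UEL(λ, δ, ϑ₂). The closed form $\vartheta_{2} / (\vartheta_{1} + \vartheta_{2})$ used for comparison is our evaluation of $\int_{0}^{1} f_{1}(y) F_{2}(y) dy$ under that convention. Function names, sample size, and parameter values are illustrative.

```python
import math
import random


def uel_quantile(q, delta, lam, theta):
    """Quantile function obtained by inverting the UEL CDF."""
    base = 1.0 - (1.0 - q) ** (1.0 / theta)
    if base <= 0.0:               # q numerically indistinguishable from 0
        return 0.0
    return math.exp((1.0 - base ** (-1.0 / delta)) / lam)


def stress_strength_mc(delta, lam, theta1, theta2, n=100_000, seed=1):
    """Estimate R = P(Y2 < Y1) by simulating independent strength/stress pairs."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n):
        strength = uel_quantile(rng.random(), delta, lam, theta1)
        stress = uel_quantile(rng.random(), delta, lam, theta2)
        hits += stress < strength
    return hits / n


def stress_strength_closed(theta1, theta2):
    """Closed form of P(Y2 < Y1) when lam and delta are common to both variables."""
    return theta2 / (theta1 + theta2)


if __name__ == "__main__":
    print(stress_strength_mc(2.0, 1.0, 1.5, 3.0), stress_strength_closed(1.5, 3.0))
```

Under these assumptions the reliability depends only on ϑ₁ and ϑ₂, so the simulated value should not change materially when the common λ and δ are varied.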

Parameter estimation of the UEL model
The parameter estimators of the UEL distribution based on ML, MPS, and Bayesian estimation methods are discussed in this section.

ML method
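Let $y_{1}, \ldots, y_{n}$ be an observed sample from the UEL distribution with parameters δ, λ, and ϑ. The likelihood function (LF), say L(y|Θ), of the UEL distribution is expressed as:
$$L(y|\Theta) = (\lambda \delta \vartheta)^{n} \prod_{i=1}^{n} \frac{1}{y_{i}} (1 - \lambda \ln(y_{i}))^{-\delta - 1} \left[1 - (1 - \lambda \ln(y_{i}))^{-\delta}\right]^{\vartheta - 1}.$$
Then the natural logarithm of the LF, say ℓ, of the UEL distribution is:
$$\ell = n \ln(\lambda \delta \vartheta) - \sum_{i=1}^{n} \ln(y_{i}) - (\delta + 1) \sum_{i=1}^{n} \ln(1 - \lambda \ln(y_{i})) + (\vartheta - 1) \sum_{i=1}^{n} \ln\left[1 - (1 - \lambda \ln(y_{i}))^{-\delta}\right]. \tag{15}$$
The ML equations, which are based on (15), are therefore as follows:
$$\frac{\partial \ell}{\partial \delta} = \frac{n}{\delta} - \sum_{i=1}^{n} \ln(1 - \lambda \ln(y_{i})) + (\vartheta - 1) \sum_{i=1}^{n} \frac{(1 - \lambda \ln(y_{i}))^{-\delta} \ln(1 - \lambda \ln(y_{i}))}{1 - (1 - \lambda \ln(y_{i}))^{-\delta}},$$
$$\frac{\partial \ell}{\partial \lambda} = \frac{n}{\lambda} + (\delta + 1) \sum_{i=1}^{n} \frac{\ln(y_{i})}{1 - \lambda \ln(y_{i})} - \delta (\vartheta - 1) \sum_{i=1}^{n} \frac{\ln(y_{i}) (1 - \lambda \ln(y_{i}))^{-\delta - 1}}{1 - (1 - \lambda \ln(y_{i}))^{-\delta}},$$
$$\frac{\partial \ell}{\partial \vartheta} = \frac{n}{\vartheta} + \sum_{i=1}^{n} \ln\left[1 - (1 - \lambda \ln(y_{i}))^{-\delta}\right].$$
The ML estimators of δ, λ, and ϑ are obtained by solving the non-linear equations ∂ℓ/∂δ = 0, ∂ℓ/∂λ = 0, and ∂ℓ/∂ϑ = 0 numerically with an optimization algorithm such as conjugate-gradient optimization.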
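A minimal sketch of the numerical ML step is given below, assuming the UEL log-density $\ln f(y) = \ln(\lambda \delta \vartheta) - \ln y - (\delta + 1) \ln(1 - \lambda \ln y) + (\vartheta - 1) \ln[1 - (1 - \lambda \ln y)^{-\delta}]$. It minimizes the negative log-likelihood directly instead of solving the score equations, and it uses SciPy's derivative-free Nelder-Mead method in place of the conjugate-gradient algorithm mentioned in the text (`method="CG"` is the corresponding SciPy option). The log-scale parametrization, the starting values, and the function names are our assumptions.

```python
import numpy as np
from scipy.optimize import minimize


def uel_quantile(q, delta, lam, theta):
    """Quantile function of the UEL distribution (used here only to simulate data)."""
    base = 1.0 - (1.0 - q) ** (1.0 / theta)
    return np.exp((1.0 - base ** (-1.0 / delta)) / lam)


def uel_negloglik(log_params, y):
    """Negative log-likelihood; parameters enter on the log scale to stay positive."""
    delta, lam, theta = np.exp(log_params)
    t = np.log1p(-lam * np.log(y))          # ln(1 - lam*ln y), non-negative
    log_a = np.log(-np.expm1(-delta * t))   # ln[1 - (1 - lam*ln y)^(-delta)]
    ll = (y.size * np.log(lam * delta * theta) - np.sum(np.log(y))
          - (delta + 1.0) * np.sum(t) + (theta - 1.0) * np.sum(log_a))
    return -ll if np.isfinite(ll) else np.inf


def fit_uel_mle(y, start=(1.0, 1.0, 1.0)):
    """Return the estimates (delta, lam, theta) and the maximized log-likelihood."""
    y = np.asarray(y, dtype=float)
    res = minimize(uel_negloglik, np.log(start), args=(y,), method="Nelder-Mead",
                   options={"xatol": 1e-6, "fatol": 1e-8, "maxiter": 4000})
    return np.exp(res.x), -res.fun


if __name__ == "__main__":
    # Simulated data with illustrative true values (delta, lam, theta) = (3, 2, 1.5).
    rng = np.random.default_rng(42)
    y = uel_quantile(rng.random(500), 3.0, 2.0, 1.5)
    print(fit_uel_mle(y))
```

Since the likelihood surface of exponentiated-type models can be flat in some directions, it is prudent to try several starting values and keep the fit with the largest log-likelihood.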

MPS method
The MPS technique, an alternative to the ML methodology, provides parameter estimates for a continuous distribution. Suppose that $y_{(1)}, \ldots, y_{(n)}$ are the observed ordered samples from the UEL distribution with parameters δ, λ, and ϑ. The MPS estimators of δ, λ, and ϑ are obtained by maximizing the following:
$$\ell(g) = \frac{1}{n + 1} \sum_{i=1}^{n+1} \ln\left[F(y_{(i)}; \Theta) - F(y_{(i-1)}; \Theta)\right],$$
with $F(y_{(0)}; \Theta) = 0$ and $F(y_{(n+1)}; \Theta) = 1$. That is, the MPS estimators are derived by partially differentiating the natural logarithm of the UEL distribution's product spacing function with respect to the parameters and solving the non-linear equations ∂ℓ(g)/∂δ = 0, ∂ℓ(g)/∂λ = 0, and ∂ℓ(g)/∂ϑ = 0 numerically with an optimization algorithm (conjugate-gradient or Newton-Raphson optimization).
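The MPS criterion can be sketched in code as follows, under the usual definition of spacings $D_{i} = F(y_{(i)}) - F(y_{(i-1)})$ with $F(y_{(0)}) = 0$ and $F(y_{(n+1)}) = 1$, and the UEL CDF $F(y) = 1 - [1 - (1 - \lambda \ln y)^{-\delta}]^{\vartheta}$. As a simplification, the objective is maximized directly with SciPy's Nelder-Mead method instead of solving the derivative equations. The small floor on the spacings, the log-scale parametrization, and the function names are assumptions for illustration.

```python
import numpy as np
from scipy.optimize import minimize


def uel_quantile(q, delta, lam, theta):
    """Quantile function of the UEL distribution (used here only to simulate data)."""
    base = 1.0 - (1.0 - q) ** (1.0 / theta)
    return np.exp((1.0 - base ** (-1.0 / delta)) / lam)


def uel_cdf(y, delta, lam, theta):
    """UEL CDF for y in (0, 1), written with log1p/expm1 for numerical stability."""
    t = np.log1p(-lam * np.log(y))
    return 1.0 - (-np.expm1(-delta * t)) ** theta


def neg_log_spacing(log_params, y_sorted):
    """Negative mean log-spacing; parameters are on the log scale."""
    delta, lam, theta = np.exp(log_params)
    cdf = uel_cdf(y_sorted, delta, lam, theta)
    spacings = np.diff(np.concatenate(([0.0], cdf, [1.0])))
    spacings = np.maximum(spacings, 1e-300)   # guard against ties and round-off
    value = -np.mean(np.log(spacings))
    return value if np.isfinite(value) else np.inf


def fit_uel_mps(y, start=(1.0, 1.0, 1.0)):
    """Return the MPS estimates (delta, lam, theta) and the maximized mean log-spacing."""
    y_sorted = np.sort(np.asarray(y, dtype=float))
    res = minimize(neg_log_spacing, np.log(start), args=(y_sorted,),
                   method="Nelder-Mead",
                   options={"xatol": 1e-6, "fatol": 1e-8, "maxiter": 4000})
    return np.exp(res.x), -res.fun


if __name__ == "__main__":
    # Simulated data with illustrative true values (delta, lam, theta) = (3, 2, 1.5).
    rng = np.random.default_rng(42)
    y = uel_quantile(rng.random(500), 3.0, 2.0, 1.5)
    print(fit_uel_mps(y))
```

Because the spacings are non-negative and sum to one, their geometric mean cannot exceed 1/(n + 1), so the maximized mean log-spacing is bounded above by −ln(n + 1); values close to that bound indicate a nearly uniform set of fitted spacings.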
The ML estimate (MLE), MPS estimate (MPSE), and Bayesian estimate of δ, λ, and ϑ are calculated. Then the biases and mean squared errors (MSEs) of the different estimates are determined, and confidence intervals (CIs) are obtained. For the MLE, asymptotic CIs and bootstrap CIs based on the bootstrap-P (BP) and bootstrap-T (BT) algorithms are calculated. For the MPSE, the confidence lengths (CLs) of the asymptotic CIs are calculated, while for the Bayesian estimation method, credible CIs (CCIs) based on the highest posterior density are obtained (see [32,33]). Simulated results for the bias, MSE, and length of CI (L.CI) are reported in Tables 3 and 4 and Fig 2. The following concluding remarks can be made:
1. As n increases, the bias and MSE of all estimates decrease.
2. The MPSEs perform better than the MLEs on these measures.
3. The Bayesian estimates perform better than the MLEs and MPSEs.
4. The asymptotic CIs for the MPSEs are shorter than the asymptotic CIs for the MLEs.
5. The credible intervals for the Bayesian estimates are shorter than the asymptotic CIs for the MLEs and MPSEs.
6. The bootstrap CIs are the shortest among all CIs.
7. The bootstrap-T CIs are shorter than the bootstrap-P CIs.

PLOS ONE
Let $Y_{1}, \ldots, Y_{n}$ be n independent random variables, where each $Y_{i}$, i = 1, ..., n, follows the PDF (17) with unknown quantile parameter $\mu_{i}$ and unknown shape parameters δ and ϑ, and where q ∈ (0, 1) is assumed to be known, i.e., $Y_{i}$ ∼ UEL(δ, ϑ, $\mu_{i}$). For the UEL quantile regression model to be defined, the quantile $\mu_{i}$ of $Y_{i}$ must satisfy the following functional relation:
$$g(\mu_{i}) = X_{i} B, \quad i = 1, \ldots, n,$$
where $X_{i} = (1, X_{1i}, \ldots, X_{(p-1)i})$ represents the observations on p known covariates and $B = (B_{0}, B_{1}, \ldots, B_{p-1})^{T}$ is a p-dimensional vector of unknown regression coefficients, p < n. We suppose that the quantile link function g(.) maps (0, 1) into ℝ and is strictly increasing and twice differentiable. There are several choices for the link function g(.). We only consider the logit link in this paper because its parameters translate directly into odds. Reference [35] gave the corresponding interpretation for the logit link when μ is the mean of the beta distribution. Under the logit link function, $\mu_{i}$ can be written as follows:
$$\mu_{i} = \frac{\exp(X_{i} B)}{1 + \exp(X_{i} B)}, \quad i = 1, \ldots, n.$$
For a given q ∈ (0, 1), let $\Theta = (B^{T}, \delta, \vartheta)$ be the vector of p + 2 unknown parameters to be estimated by ML. Using the structure of the PDF (17), the log-LF is given by:
$$\ell(\Theta) = \sum_{i=1}^{n} \ln f(y_{i}; \delta, \vartheta, \mu_{i}).$$
The MLE of the parameters cannot be computed analytically and must be calculated numerically using an optimization algorithm such as Newton-Raphson or quasi-Newton. Under regularity conditions and when n is large, the asymptotic distribution of the MLE is approximately multivariate normal with mean vector $(B^{T}, \delta, \vartheta)$ and variance-covariance matrix $V^{-1}(\Theta)$, where
$$V(\Theta) = -E\left[\frac{\partial^{2} \ell(\Theta)}{\partial \Theta \, \partial \Theta^{T}}\right]$$
is the expected Fisher information matrix.
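The quantile regression likelihood can be sketched as follows. The sketch assumes that the quantile parametrization is obtained by solving the UEL quantile function for λ, that is, $\lambda_{i} = [1 - (1 - (1 - q)^{1/\vartheta})^{-1/\delta}] / \ln(\mu_{i})$, so that $\mu_{i}$ is the q th quantile of $Y_{i}$; this is our reading of the UEL(δ, ϑ, μ) notation, not a formula quoted from the paper. Nelder-Mead is used for robustness in place of the Newton-type algorithms mentioned in the text (`method="BFGS"` is SciPy's quasi-Newton option), and all function names are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize


def lam_from_quantile(mu, delta, theta, q=0.5):
    """Value of lam that makes mu the q-th quantile of UEL(delta, lam, theta)."""
    c = 1.0 - (1.0 - (1.0 - q) ** (1.0 / theta)) ** (-1.0 / delta)  # c < 0
    return c / np.log(mu)                                            # ln(mu) < 0, so lam > 0


def uel_cdf(y, delta, lam, theta):
    """UEL CDF for y in (0, 1)."""
    t = np.log1p(-lam * np.log(y))
    return 1.0 - (-np.expm1(-delta * t)) ** theta


def qreg_negloglik(params, X, y, q=0.5):
    """Negative log-likelihood; params = (beta, ln delta, ln theta), logit link."""
    p = X.shape[1]
    beta = params[:p]
    delta, theta = np.exp(params[p:])
    mu = 1.0 / (1.0 + np.exp(-(X @ beta)))     # inverse logit gives the q-th quantile
    lam = lam_from_quantile(mu, delta, theta, q)
    t = np.log1p(-lam * np.log(y))
    log_a = np.log(-np.expm1(-delta * t))
    ll = np.sum(np.log(lam * delta * theta) - np.log(y)
                - (delta + 1.0) * t + (theta - 1.0) * log_a)
    return -ll if np.isfinite(ll) else np.inf


def fit_uel_qreg(X, y, q=0.5):
    """Fit the regression coefficients and shape parameters by maximum likelihood."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    p = X.shape[1]
    res = minimize(qreg_negloglik, np.zeros(p + 2), args=(X, y, q),
                   method="Nelder-Mead",
                   options={"xatol": 1e-6, "fatol": 1e-8, "maxiter": 5000})
    return {"beta": res.x[:p], "delta": float(np.exp(res.x[p])),
            "theta": float(np.exp(res.x[p + 1])), "loglik": float(-res.fun)}
```

A design matrix with a leading column of ones and responses strictly inside (0, 1) can be passed as `fit_uel_qreg(X, y, q=0.5)` for median regression. Standard errors could then be approximated from a numerical Hessian of `qreg_negloglik` at the optimum, which plays the role of the observed information.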

Data applications
This section contains three real-world data sets that demonstrate the UEL distribution's modelling capabilities. Quantile regression modelling is addressed with the first and second data sets, while data modelling is addressed with the third data set. Unit-Weibull regression (UWR), beta regression (BetaR), original linear regression (OLR), and quantile regression (QR) are the four competitor regression models that are compared to the UEL quantile regression model for the first and second data sets. The QR fit is obtained with the "rq" function in the "quantreg" package in R (see [36]). To compare the UEL model with the considered regression models, we compute Akaike's information criterion (AIC) and the Bayesian information criterion (BIC). For the third data set, the fit of the UEL distribution is compared with those of some other competitive models to illustrate the potential of the UEL model.

Quantile regression modeling for confidence of mock jurors in their verdicts
The first data set contains the responses of naive mock jurors to the conventional two-option verdict (guilt vs. acquittal) versus a three-option verdict setup (the third option was the Scottish 'not proven' alternative), in the presence or absence of conflicting testimonial evidence. The data frame contains 104 observations on three variables. The source of these data is Ref. [37]. The following covariates are connected to the response variable:
Verdict: x 1 is a factor indicating whether a two-option or three-option verdict is requested, where two-option is -1 and three-option is 1.
Conflict: x 2 is a factor indicating whether there is conflicting testimonial evidence, where no is 1 and yes is -1.
The regression framework assumed for μ i uses the logit link, where μ i denotes the median (q = 0.5) in the UEL regression (UELR) model. Table 5 gives the MLEs and SEs along with the AIC and BIC measures for the UELR, UWR, BetaR, OLR, and QR models.

Quantile regression modeling for proportion of household income spent on food
Here, we consider data on the proportion of income spent on food for a random sample of 38 households in a large US city. The source of these data is Ref. [38]. The covariates connected to the response variable are:
Income: x 1 is the household income.
Persons: x 2 is the number of persons living in the household.
The dependent variable is I(food/income). The regression structure assumed for μ i is
$$\operatorname{logit}(\mu_{i}) = B_{0} + B_{1} x_{1i} + B_{2} x_{2i}, \quad i = 1, \ldots, 38,$$
where μ i denotes the median (q = 0.5) in the UELR model. Table 6 gives the MLEs and SEs along with the AIC and BIC measures for the UELR, UWR, BetaR, OLR, and QR models. We note that UELR has a
The MLEs and their SEs for the investigated models are provided in Table 8. For all fitted models using the Covid-19 data set, Table 9 shows the values of the AIC, BIC, CAIC, HQIC, CVMC, ADC, and KSD statistics, as well as their related P-values. Table 9 shows that the UEL distribution has the most negative (that is, lowest) AIC, BIC, CAIC, and HQIC values and the largest P-value, as well as the smallest KSD, CVMC, and ADC values, when compared to the other models used to fit the Covid-19 data. The empirical, histogram and PP-plot for the UEL distribution for Saudi Arabia's Covid-19 data are shown in

Summary and conclusion
This study proposes the unit-exponentiated Lomax distribution, based on an appropriate transformation, which is useful for modeling data on the unit interval. Some mathematical characteristics of the UEL distribution are explored, such as moments, PWMs, IM, entropy measures, and stress-strength reliability. The maximum likelihood, Bayesian, and maximum product of spacing methods are employed for estimating the parameters of the suggested distribution. Simulation results show that the Bayesian estimates are preferred to the alternative estimates according to the criteria considered. The UEL regression model is shown to be a reasonable alternative to unit-Weibull regression, beta regression, and the original linear regression models when using the mock jurors and food spending data. Using Covid-19 data, the proposed model outperforms the beta, Kumaraswamy-Kumaraswamy, Topp-Leone generalized exponential, Marshall-Olkin Kumaraswamy, Topp-Leone Weibull Lomax, and type II power Topp-Leone inverse exponential distributions across a variety of comparison criteria. In future research, we will discuss applications of the UEL distribution along the lines of [43–51].